INTRODUCTION

Normal and malignant human mammary epithelial cells are able to synthesize and to respond to a variety of
locally acting growth factors through specific receptors. Among these are the type 1 family of growth
factor receptors, which consists of the epidermal growth factor receptor, ErbB-2/Neu, ErbB-3, and ErbB-4 1-4.
They are required for normal mammary development and lactation and are aberrantly expressed in
approximately 40% of breast carcinomas. Indeed, in human breast cancer the prognosis of a patient is
inversely correlated with the overexpression and/or amplification of this receptor family. The physiological
ligand for these receptors has been shown to be heregulin 5,6. Interaction of heregulin with ErbB3 induces
heterodimerization between ErbB2 and ErbB3, which results in the transphosphorylation and activation of
the ErbB2 receptor. Phosphorylation of this receptor initiates signaling cascades, which in turn can impact
cell function, growth and division.

Wilson et al. 7 have identified a novel nuclear target for heregulin signaling, which responds to
growth factor treatment of cells with an increased ability to be labeled with GTP. They identified this target as
the 20-kDa subunit of the nuclear cap binding complex (CBC) and demonstrated that the CBC is stimulated to
bind to capped RNAs in response to heregulin. Based on these observations, Wilson et al. suggested that
heregulin could impact cell growth by modulating gene expression at the level of RNA processing via
the CBC. They further suggested that, in a situation where the heregulin signal is constitutive, the active CBC
could affect gene expression by increasing the rate of RNA processing, and thus contribute to unregulated
cell growth and division.

Eukaryotic RNA polymerase II transcripts (mRNAs and U snRNAs) are modified by the addition of
a 7-methyl guanosine cap in a 5'-5' triphosphate linkage. The CBC binds cotranscriptionally to the
monomethylated cap structure and is required for the efficient (cap-dependent) splicing of precursor
mRNA8,9,10, most likely by promoting the assembly of the commitment complex for spliceosome formation.
The CBC also plays an essential role in the nuclear export of U snRNAs11,12, which is a prerequisite for U
snRNP assembly in the cytosol, and ensures efficient 3'-end processing13. Recently, a role for the CBC
in mediating a pioneer round of translation for mRNAs subject to nonsense-mediated decay has been
described14. Ultimately, mature and error-proofed transcripts are relayed from the CBC to eIF-4E, the
cytosolic cap-binding component of the translation initiation complex, eIF-4F, to undergo several rounds of
translation in the cytosol. Both mitogenic and stress signals activate eIF-4E in order to stimulate protein
synthesis 15. We have demonstrated that the CBC is subject to similar types of regulation7,16, thus raising the
likelihood of a regulated, temporal processing and exchange of capped mRNAs between the CBC and eIF-4E.
Given the essential role of cap binding in a number of aspects of RNA processing, there has been a great deal
of interest in understanding the molecular basis by which the cap structure associates with these proteins.
X-ray crystallographic structures have been reported for m7GDP 17, m7GTP, and m7GpppG 18 bound to eIF-4E,
providing detailed information regarding cap recognition by this cytosolic cap-binding protein. An X-ray
structure for a proteolytically derived CBC has also been described19, yet structural information detailing
interactions between the nuclear cap-binding complex and its cap ligand has remained elusive. Thus, the
fundamental questions of how the CBP20 subunit specifically recognizes the methylated base, and how
CBP80 promotes high affinity interactions with the cap, remain to be addressed.


BODY 

1.  Experimental  Procedures 

Protein  purification  and  crystallization  experiments. 

Purification of the m7GpppG-bound complex: Recombinant CBC was prepared from CBP20 and CBP80
co-expressed in insect cells. Cells were lysed using 0.4% CHAPS in a solution containing 150 mM NaCl,
30 mM Tris (pH 8.0), 2 mM DTT and 1 mM sodium azide (Buffer A). Lysates were centrifuged at 40,000
rpm for 50 minutes, and the supernatant was then loaded onto a Q-sepharose resin (Pharmacia) and eluted
with Buffer A using a NaCl gradient from 150 to 450 mM. The fraction containing CBC was then bound to an
m7GTP-sepharose (Pharmacia) column and eluted with 200 µM m7GpppG (New England Biolabs) in Buffer
A. The m7GpppG-bound CBC was applied to a Q-resource column (Pharmacia) and further purified with a
250-450 mM NaCl gradient in Buffer A using an AKTA system (Pharmacia). As a final step, the protein was
loaded onto a Superdex 200 (26/60, Pharmacia) gel filtration column in Buffer A without NaCl. The final
CBC-m7GpppG complex was assessed to be >98% pure. Purification of the unbound complex was achieved
by substituting the m7GTP-sepharose resin with a butyl-sepharose resin (Pharmacia). CBC crystals were
obtained at 4°C by sitting drop vapor diffusion. Two microliters of protein solution (12 mg/ml) were mixed
with the same volume of a solution containing 22% PEG 400, 100 mM Tris-HCl (pH 7.25), 125 mM MgCl2,
10% glycerol, 2 mM DTT, 10 mM glycine and 3% ethylene glycol to produce crystals diffracting up to 2.1 Å.
Crystallization of the unliganded CBC was accomplished using the same conditions, except that
several rounds of seeding and dehydration were performed in order to obtain diffracting crystals. For
cryocrystallography, the crystals were pre-soaked in a stabilization solution containing 40% PEG 400 and
20% glycerol, mounted in nylon loops and flash-frozen in liquid nitrogen.
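
As a rough illustration of the sitting-drop setup described above, the following Python sketch computes the starting composition of a drop made from equal volumes of protein solution and precipitant solution. It assumes, hypothetically, that the protein is still in Buffer A without NaCl (the last purification step) and that concentrations mix linearly by volume; the function name mix and the dictionary keys are illustrative only.

```python
def mix(solution_a, vol_a, solution_b, vol_b):
    """Volume-weighted average of every component in two solutions."""
    total = vol_a + vol_b
    names = sorted(set(solution_a) | set(solution_b))
    return {
        name: (solution_a.get(name, 0.0) * vol_a
               + solution_b.get(name, 0.0) * vol_b) / total
        for name in names
    }

# Assumption: protein stock is in Buffer A without NaCl.
protein_solution = {
    "protein (mg/ml)": 12.0,
    "Tris (mM)": 30.0,
    "DTT (mM)": 2.0,
    "sodium azide (mM)": 1.0,
}

# Precipitant solution as listed in the text.
precipitant = {
    "PEG 400 (%)": 22.0,
    "Tris (mM)": 100.0,
    "MgCl2 (mM)": 125.0,
    "glycerol (%)": 10.0,
    "DTT (mM)": 2.0,
    "glycine (mM)": 10.0,
    "ethylene glycol (%)": 3.0,
}

# Two microliters of each, as in the text.
drop = mix(protein_solution, 2.0, precipitant, 2.0)

for name, value in drop.items():
    print(f"{name:22s} {value:7.2f}")
```

Under these assumptions the drop starts at half the precipitant concentration (for example 11% PEG 400) and 6 mg/ml protein; vapor diffusion then concentrates the drop toward the reservoir composition.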

Derivatization  of  crystals  with  krypton. 

Crystals pre-soaked in cryoprotectant solution were mounted in nylon loops and placed in a Hampton
Research Xenon chamber adapted to withstand 1000 psi of pressure. Crystals were subjected to 800 psi of
pressure using different protocols for 6, 12, 24 or 48 hours, with longer exposures giving the best results.
After derivatization, crystals were immediately flash-frozen in liquid nitrogen.

Diffraction  experiments  and  data  processing. 

All diffraction data were collected using synchrotron radiation at the Cornell High Energy
Synchrotron Source (CHESS, Ithaca, NY) and at the Argonne National Laboratory, IMCA-CAT (Argonne,
IL). An oscillation step of 0.5° was used throughout, and the crystal-to-detector distance varied from 200 to
225 mm. Raw reflection intensities were reduced with Denzo and Scalepack27. The space group was
determined to be P3(2)21 for both structures using reciprocal lattice characteristics. The unit cell dimensions
for the free and the m7GpppG-bound crystals were a = b = 111.68 Å, c = 177.58 Å, and a = b = 112.01 Å,
c = 175.59 Å, respectively. Both crystals contained one molecule in the asymmetric unit. The structure of CBC was
solved by the molecular replacement method (MOLREP28) using CBP80 (PDB ID code 1H6K, 19) as a search
model and the phases from a multiwavelength anomalous diffraction experiment using krypton. Krypton sites
were found using SOLVE29 and refined using SHARP30. After rigid body refinement with CNS31, the model was
subjected to several cycles of simulated annealing and model building using O. The crystallographic Rwork
and Rfree values for the CBC in complex with m7GpppG are 22.3 and 24.5, respectively (40-2.1 Å data). The
Rwork and Rfree values for the substrate-free complex are 24.5 and 29.1, respectively (40-2.75 Å data). Other
data-collection and refinement statistics are shown in the supplementary information (Table 3). Figures were
generated with Bobscript 33 and SPOCK 34 and rendered using RASTER3D 35.
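
The cell dimensions and space group quoted above are enough for a quick plausibility check of the crystal packing. The Python sketch below is not part of the reported analysis. It takes the cell dimensions as a = b = 111.68 Å, c = 177.58 Å (free) and a = b = 112.01 Å, c = 175.59 Å (bound), uses the six asymmetric units per cell of P3(2)21, and estimates the Matthews coefficient. The complex mass is an assumption: 156 + 790 residues at an average of about 110 Da per residue. The solvent estimate uses the common approximation 1 - 1.23/Vm.

```python
import math

def trigonal_cell_volume(a, c):
    """Cell volume in cubic angstroms for a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))

def matthews_coefficient(cell_volume, n_asu, mass_da):
    """Crystal volume per dalton of protein (cubic angstroms per Da)."""
    return cell_volume / (n_asu * mass_da)

def solvent_fraction(vm):
    """Approximate solvent content from the Matthews coefficient."""
    return 1.0 - 1.23 / vm

N_ASU_P3221 = 6                      # asymmetric units per cell in P3(2)21
ASSUMED_DA_PER_RESIDUE = 110.0       # assumption, not from the text
ASSUMED_MASS_DA = (156 + 790) * ASSUMED_DA_PER_RESIDUE

cells = {
    "free": (111.68, 177.58),
    "m7GpppG-bound": (112.01, 175.59),
}

for label, (a, c) in cells.items():
    volume = trigonal_cell_volume(a, c)
    vm = matthews_coefficient(volume, N_ASU_P3221, ASSUMED_MASS_DA)
    print(f"{label}: V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, "
          f"solvent ~ {solvent_fraction(vm):.0%}")
```

With these assumed masses, both crystal forms give a Matthews coefficient near 3 Å³/Da, which is compatible with one CBP20-CBP80 complex per asymmetric unit.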


2.  Results 

We have determined the three-dimensional structure of the full-length CBC bound to m7GpppG by
molecular replacement, using CBP80 as an initial search model19 and the phases obtained from a krypton
multiple-wavelength anomalous dispersion (MAD) experiment (Figure 1A). The atomic structure of CBP20
includes residues 6-153 of the total 156, and conforms to the classical ribonucleotide binding domain (RNP)
fold containing four anti-parallel β-strands arranged in the order β4-β1-β3-β2, packed against the αA and αB
helices. The RNP-fold comprises two conserved motifs designated RNP1 (β3, residues 81-88) and
RNP2 (β1, residues 41-46) (Figure 1B; grey). Additions to the RNP-fold through N-terminal (helices α1, α2,
and α3; magenta) and C-terminal (the β4-αC loop and helix αC; green) insertions contribute substantially to
the binding of the 7-methyl guanosine cap. These extensions enable the CBC to exhibit functional specificity
relative to other RNP-containing proteins like the U1A protein20,21, the Sex-lethal protein22, and the
poly(A)-binding protein, which all bind primary and secondary elements of RNA in distinct ways but are
incapable of binding the cap structure.

CBP80 is a super-helical structure consisting of three domains (domain 1: residues 23-244, helices 1-11;
domain 2: residues 309-478, helices 15-24; and domain 3: residues 498-790, helices 25-37) connected by
two linkers (Figure 1A). The N-terminal domain 1 is structurally similar to the middle domain of eIF-4G
(MIF4G), which plays a key role in cap-dependent mRNA translation24. The atomic structure of CBP80
includes residues 23-526, 538-667 and 687-790 (residues 1-22, 527-537 and 668-686 have no visible electron
density). CBP80 neither binds directly to nor covers the cap-binding site, despite the well-documented
requirement of this subunit for the high affinity binding of the 7-methyl guanosine cap to CBP20 8.
Nonetheless, there are extensive interactions between CBP80 and the cap-bound CBP20 subunit, which bury a
total of 3176 Å² of surface. Helices A and B of CBP20, which are arranged orthogonal to one another and
form an underlying layer to the β-sheet core of the RNP domain, engage domains 1 and 2 of CBP80.
Additionally, there are interactions between the N-termini of CBP20 and CBP80 that contribute to
cap binding and will be detailed below.

Figure 2A illustrates the electron density map for the m7GpppG cap analog and Figure 2B shows a
surface representation of the cap-binding site within CBP20. The methylated cap analog fits within a deep
cavity of approximately 180 Å³, with the electrostatic potential switching from negative to positive in the clockwise
direction, thus counterbalancing charges contributed by the cap structure. Recognition and binding of the cap
analog is achieved in two ways. First, π-π stacking interactions occur between the electron-deficient 7-methyl
guanosine ring and two electron-rich aromatic residues, Tyr 20 (loop α2-α3) and Tyr 43 (RNP2). The
stacking residues and the methylated base are oriented along three planes parallel to the β-platform. The upper
plane is formed by Tyr 20, which is 3.25 Å away from the methylated base (which constitutes the middle plane)
(Figure 2C, left) and overlaps partially with the guanine ring (Figure 2C, right). Given that the average van
der Waals distance for π-π stacking interactions is 3.4 Å (ref), the interaction between Tyr 20 and the
methylated guanosine ring is likely to be strong. The lower plane is formed by Tyr 43, which is separated
from the methylated base by 3.4 Å (Figure 2C, left); its degree of overlap with the methylated base is
larger than that of Tyr 20 (Figure 2C, right). Theoretical data and experimental studies using site-directed
mutagenesis suggest that the area of overlap between the stacking residues and the methylated base contributes
significantly to the binding affinity25-27. Methylation of the guanine ring draws electron density away from
the imidazole ring, allowing strong interactions with the side chains of the aromatic residues and thereby
enhancing the π-π stacking interactions. Second, in addition to the π-π stacking interactions, there are a number of
specific interactions between the cap structure and residues in the β4-αC loop (Figure 2D). These include
hydrogen bonds with the 7-methyl guanosine that mimic Watson-Crick pairing between guanosine and
cytosine, and hydrogen bonds and van der Waals contacts with the ribose and phosphate oxygen atoms.
Additionally, there is a stacking interaction between Tyr 138 and the non-methylated guanosine moiety
(inter-planar distance = 3.6 Å), conferring the specificity for a 5'-5' triphosphate linkage. Figure 2E shows that the
residues responsible for binding the cap structure are highly conserved among different species.

Our structural data explain the results from previous mutagenesis studies19. For example, mutation of
the π-π stacking residue Tyr 43 to alanine in CBP20 reduces the affinity of the CBC for m7G-capped U1
snRNA by more than 100-fold. Likewise, mutations of Asp 116 (from the β4-αC loop) and Phe 83 (from the
β3 strand) to alanine cause greater than 100-fold reductions in the affinity for the cap structure. Our
structural data predict that mutating Asp 116 to Ala disrupts the Watson-Crick-like hydrogen bond between
Asp 116(OD2) and the exocyclic NH2, and potentially alters the geometry of the β4-αC loop, as Asp 116 hydrogen
bonds with Gly 118 and Arg 123. The effect of changing Phe 83 to alanine is less clear. Given that Phe 83
participates in C-H···O bonds with atom O4 from the ribose, the disruption of these interactions could decrease
the binding affinity.

The interactions between the CBC and the methylated base described above support the observed high
affinity interaction between methylated cap structures and the CBC (approximate Kd ~ 5 nM)7. The cavity
where the methylated base binds is larger in CBP20 than in eIF-4E, and the interplanar distances
between the methylated base and the residues participating in the π-π stacking interactions are shorter.
Although both proteins bind the methylated base by satisfying Watson-Crick base pairing, the overall
number of hydrogen bonds between CBP20 and the methylated base, the ribose, and the phosphate oxygens is
greater than that for eIF-4E. Thus, it is not surprising that the CBC binds cap structures with significantly
higher affinity than does eIF-4E (Kd = 200 nM)28. The differences between the CBC and eIF-4E in their
binding affinities for capped RNA speak to their divergent roles in RNA processing. It is attractive to
envision that the higher affinity of CBP20 for the cap makes the CBC more suitable for "chaperoning" capped
RNAs through sequential RNA processing events during the life span of RNA molecules in the nucleus.
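
To put the affinities quoted above on an energy scale, the following Python sketch converts a ratio of dissociation constants into a difference in binding free energy using ΔΔG = RT ln(ratio). The temperature of 298.15 K is an assumption, since the text does not state the temperature of the binding measurements, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal per mol per K

def ddg_from_fold_change(fold, temp_k=298.15):
    """Free-energy difference (kcal/mol) for a given fold change in Kd."""
    return R_KCAL * temp_k * math.log(fold)

def ddg_from_kd_ratio(kd_weak, kd_tight, temp_k=298.15):
    """Free-energy difference (kcal/mol) between a weak and a tight binder."""
    return ddg_from_fold_change(kd_weak / kd_tight, temp_k)

# Kd values quoted in the text: about 5 nM for the CBC, 200 nM for eIF-4E.
cbc_vs_eif4e = ddg_from_kd_ratio(200e-9, 5e-9)

# The mutagenesis results are quoted as reductions of more than 100-fold.
hundred_fold = ddg_from_fold_change(100.0)

print(f"CBC vs eIF-4E (40-fold):   {cbc_vs_eif4e:.2f} kcal/mol")
print(f"100-fold loss of affinity: {hundred_fold:.2f} kcal/mol")
```

At the assumed temperature, the 40-fold difference between the CBC and eIF-4E corresponds to roughly 2.2 kcal/mol, and a 100-fold loss of affinity to roughly 2.7 kcal/mol, values of the order expected for losing a small number of hydrogen bonds or part of a stacking interaction.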

A key question concerns how access to the cap-binding site of the CBC is regulated. Such regulation
is necessary so that each mature mRNA molecule can dissociate from the CBC, enabling the processed
mRNA to bind to eIF-4E and to be translated by the eIF-4F complex, and ultimately allowing a
precursor mRNA molecule to bind to the CBC in place of the mature mRNA. Comparison of the structure
of the cap-bound CBC with that of the free, unbound protein provides some insight into the molecular basis
of this regulation. Figures 3A and 3B show the electrostatic surface representations of CBP20 in the unbound
and bound states, respectively, and demonstrate that a significant conformational change is associated with the
binding of the cap analog. The blue arrows in Figure 3A show the location of the empty cap-binding slot. The
transition from the open (unbound) to the closed (bound) state buries a previously accessible surface area of
667 Å². This conformational change results in the hinge-like movement of the amino-terminal residues 16-30
from the α2-α3 loop of CBP20 toward the β-sheet comprising the RNP-fold. In the open state, the
conformationally sensitive α2-α3 loop is stabilized by intra-chain interactions, including Asn 19(O)-Arg
21(N), Arg 21(O)-Asn 23(N), and Asn 23(O)-Phe 25(N), whereas the closed state is maintained mainly
through hydrogen bond interactions between Ser 18 and Asp 114, and salt bridges between Arg 21 and Glu
33, and between Asp 22 and Arg 127 (see Figure 3C). The net effect of the conformational change
accompanying cap binding is to rotate the α2-α3 loop approximately 55 degrees, forming a 'roof' for the
cap-binding site and possibly the RNA-binding groove (Figure 3B; the black arrows point to a possible
RNA-binding groove extending from the cap-binding site). This movement enables Tyr 20 to enter into a stacking
position together with Tyr 43, thus sandwiching the 7-methyl guanosine moiety of the cap (Figure 3C).
Arginine 26 moves nearly 11 Å from its position in the unbound CBP20 subunit, such that Tyr 20 now
partially occupies the position formerly held by the arginine residue. The conformational change also affects
the C-terminal residues of CBP20: residues 128-156 show no detectable electron density
in the unbound (open) state, suggesting that they are disordered or flexible, whereas clear electron density
is observed for this region in the closed (bound) state. Given that there is very little change within the
RNP-fold between the cap-bound and unbound states, an attractive possibility is that this region serves as the initial
binding site for the cap structure, with the ensuing conformational change then bringing Tyr 20 and the
carboxyl-terminal residues into position to further contribute to the binding interaction.

The necessity for CBP80 in achieving a high affinity interaction between the cap structure and the
CBC can now clearly be appreciated. Two 3-10 helices, α1 and α2 from CBP20, which are connected by a
short loop and form the N-terminal extension to the RNP domain (see Figure 4), fit into a binding groove
(shown in cyan) within CBP80. This groove in effect serves as a 'fulcrum' for the hinged motion of the α2-α3
loop that accompanies the cap-binding interaction (see Figure 4, inset). The N-terminal region of CBP20
forms no other contacts with CBP80, aside from those within the groove. Thus, it appears that this contact
between CBP80 and CBP20 is responsible for maintaining the N-terminus of CBP20 in a position that allows
Tyr 20 to participate in π-π stacking with the methylated base.

Given what is known about the growth factor-dependent regulation of cap binding to eIF-4E,
comparisons between the two major cap-binding protein complexes offer some plausible mechanisms
for the regulation of CBC activation and capped RNA binding by extracellular stimuli.
Specifically, the structurally similar CBP80 and eIF-4G proteins act in essence as regulatory factors to ensure
the high affinity binding of the cap structure to CBP20 and eIF-4E, respectively. Both CBP80 and eIF-4G
make important contacts with the β-sheet core and portions of the N-terminal extensions of their respective
binding partners. In the case of eIF-4E, the growth factor-stimulated phosphorylation of 4E-BP1 29,30, a
regulatory protein that binds to eIF-4E and significantly weakens its affinity for capped RNA, results in the
dissociation of 4E-BP1 and the binding of eIF-4G in its place 31,32. In the case of the CBC, the CBP80
subunit, while serving a function analogous to eIF-4G, differs from it by remaining bound to the cap-binding
subunit (CBP20) in both its high affinity (cap-bound) and low affinity (cap-free) states. However, the
N-terminal portion of CBP80, which constitutes part of the binding groove for the conformationally sensitive
loop of CBP20, contains consensus phosphorylation sites for the ribosomal p70 S6 kinase, which is under
growth factor control. Thus, CBP80 may combine features of 4E-BP1, which serves as a growth
factor/phosphorylation sensor, and eIF-4G, which helps stabilize the conformational state for the high affinity
cap-binding interaction. It is worth noting that the N-terminal portion of CBP80 also contains a binding site
for α-importin12. We have observed that the binding of α-importin to CBP80 is translated into an increased
binding affinity for capped RNA by the CBC, whereas the binding of β-importin to α-importin may weaken
the binding of the capped RNA (K.W. Cerione and G. Calero, unpublished data). Subtle alterations in the
position of the N-terminal portion of CBP80, as influenced by whether α-importin and/or β-importin (or other
as yet unidentified CBC-binding partners) are bound to that region, could alter the hinge-like movement of
the conformationally sensitive loop of CBP20. Such alterations, induced by a growth factor- or stress
response-induced phosphorylation of CBP80, could in turn be translated into an opening of the cap-binding
site, thus facilitating the exchange of a newly processed capped mRNA for another precursor mRNA
molecule and thereby providing a molecular basis for the growth factor- (or stress-) dependent activation of
the CBC.


KEY RESEARCH ACCOMPLISHMENTS

Expression of CBC in Sf9 cells.

Purification and crystallization of CBC.

Solution of the atomic structure of CBC in complex with m7GpppG at 2.1 Å resolution.
Refinement of the CBC structure with m7GpppG with a final Rfree/Rfactor of 24.7/22.2 (Table 1).
Solution of the atomic structure of the CBC without the substrate m7GpppG at 2.7 Å resolution.
Refinement of the CBC without m7GpppG with a final Rfree/Rfactor of 29.1/24.7 (Table 1).


REPORTABLE  OUTCOMES 
Submitted  manuscript: 

ATOMIC  STRUCTURE  OF  THE  NUCLEAR  CAP  BINDING  PROTEIN  (CBP20)  IN  COMPLEX 
WITH  CAP  BINDING  PROTEIN  80  (CBP80)  AND  THE  CAP  ANALOG  m7GpppG. 

G.A. Calero, K.F. Wilson, J.L. Rios, T.K. Ly, R.A. Cerione and J.C. Clardy.


FIGURE LEGENDS

Figure 1. (A) Ribbon diagrams for the CBC in complex with the cap structural analog m7GpppG. The three
structural domains of CBP80 and the three main regions of CBP20 are shown, as is the m7GpppG
(ball-and-stick). (B) The fold of CBP20 conforms to the RNP fold (shown in gray) with the following topology:
α1-α2-α3-β1-αA-β2-β3-αB-βA-βB-β4-αC (bold letters denote canonical elements). The strands of the β-sheet
are anti-parallel, twisted and arranged in the order β4-β1-β3-β2. The two middle strands, β1 (residues 41-46)
and β3 (residues 81-88), form the RNP2 and RNP1 motifs, respectively. Alpha helices αA and αB (which
interact with CBP80) are orthogonal to each other and are located almost horizontally below the plane of the
β-sheet, creating a layered core. N-terminal insertions (shown in magenta) to the RNP fold comprise two short
3-10 helices (α1 and α2, which interact with CBP80), loop α2-α3, which contains Y20, and helix α3. C-terminal
insertions (shown in green) participate in cap binding and consist of a semi-circular loop β4-αC followed by a
short helix (αC) and a C-terminal loop. Also shown are the cap analog m7GpppG and the residues that
participate in π-π stacking interactions with it, Y20 (magenta) and Y43 (gray).

Figure 2. Interactions of CBP20 with m7GpppG. (A) Electron density map (2Fo-Fc) for the methylated
dinucleotide m7GpppG calculated at 1.2 sigma. (B) Surface electrostatic representation calculated with
SPOCK32 for the m7GpppG-binding cavity. The cavity is formed from residues Y20 and L36 (top), L42 and
Y43 (bottom), and F85, R112, D114, W115, D116, A117, R123 and R127 (semicircular wall). (C) Ball-and-stick
representation of π-π stacking and degree of overlap between Y20, m7GpppG and Y43 (see text).

(D) Ball-and-stick representation of the hydrogen bond network stabilizing m7GpppG. 1) Hydrogen bonds with
7-methyl guanine include: R112(NH1)-O6 = 3.3 Å, R112(NH2)-O6 = 3.4 Å, D114(OD2)-N1 = 2.49 Å,
D114(OD2)-N2 = 2.96 Å, W115(O)-N2 = 2.96 Å and D116(OD2)-N2 = 3.13 Å. 2) Hydrogen bonds with the
ribose and phosphate oxygen atoms comprise: R123(NH1)-O2 = 2.7 Å, R123(NH2)-O2 = 3.26 Å,
R123(NH1)-O3 = 2.37 Å, R123(O)-O3 = 2.78 Å, R127(N)-O2A = 2.71 Å, G128(N)-O2A = 3.72 Å,
N133(NE2)-O1A = 2.69 Å, V134(N)-O2A = 3.26 Å, Y20(OH)-O1B = 2.81 Å and F83(CE1)-O4 = 3.5 Å,
F83(CE1)-O4 = 3 Å and D116(OD2)-F85(CE2) = 3.02 Å. 3) Intra-chain interactions include hydrogen bonds
D116(OD1)-G118(N) = 2.64 Å and Y43(OH)-N133(NE2) = 2.64 Å and a salt bridge
D116(OD2)-R123(NH2) = 2.58 Å. 4) Interactions with the non-methylated guanosine include stacking
interactions between Y138 and the guanine ring (inter-planar distance = 3.6 Å) and the C-H···O bond
V134(CG1)-O4 = 3.18 Å.

(E) Amino acid sequence alignment (MultAlin37) using Alscript38 for human and selected non-human
CBP20s. Conserved residues include, among others: 1) most residues in the RNP1 (β3) and RNP2 (β1) motifs;
2) residues involved in stacking interactions with m7GpppG (Y20, Y43 and Y138); 3) hinge residues (L17 and
S18); 4) residues involved in stabilization of the closed conformation (Y20, R21, D22, E33); 5) residues in
loops β2-β3; and 6) residues involved in hydrogen bonding with the cap dinucleotide (see text).

Figure 3. Conformational change associated with cap binding to CBP20. (A) and (B) Electrostatic surface
representations of CBP20 in the open and closed conformations, respectively. Rotation of loop 1 upon cap
binding forms the roof of the cap-binding cavity. Blue arrows in (A) show the location of the cap-binding site
(see text); black arrows in (B) show a possible RNA-binding surface in CBP20. (C) Analysis of the free
(open) (red) versus the substrate-bound (closed) (green) conformations of CBP20 (r.m.s. deviation on
Cα = 0.508 Å) reveals considerable structural changes associated with the binding of the m7GpppG cap analog.
Calculation of r.m.s. distances on Cα and φ-ψ angles for the two conformations suggests that the N-terminal
residues 16-29 from loop α2-α3 undergo a hinged motion towards the β-sheet. The net effect of this motion is
to rotate loop α2-α3 approximately 55° so that Y20 moves about 8.3 Å on top of Y43 to engage in cap binding,
while Arg26 (which was occupying the Y20 position) moves outward about 10.8 Å. In addition, a small rotation
(opposite to the direction of the hinged motion) of the β-sheet as a rigid body aligns Y43 in a plane almost
parallel to the plane of Y20.
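
For readers unfamiliar with the r.m.s. comparison mentioned in the Figure 3 legend, the Python sketch below shows how an overall Cα r.m.s. deviation and per-residue displacements are computed for two structures that have already been superimposed. It is a minimal illustration only: the coordinates are hypothetical toy values, not CBP20 data, and no superposition step is included.

```python
import math

def per_residue_shift(coords_a, coords_b):
    """Distance moved by each matched C-alpha atom (same units as input)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]

def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation over matched, pre-superimposed atoms."""
    shifts = per_residue_shift(coords_a, coords_b)
    if not shifts:
        raise ValueError("no atoms to compare")
    return math.sqrt(sum(d * d for d in shifts) / len(shifts))

# Hypothetical toy coordinates: three atoms stay put, one moves by 8 angstroms.
open_state = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
closed_state = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 8.0, 0.0)]

print("per-residue shifts:", per_residue_shift(open_state, closed_state))
print("overall RMSD:", ca_rmsd(open_state, closed_state))
```

The toy case illustrates why per-residue distances are reported alongside a single r.m.s. value: a global average can understate a large movement confined to a few residues, such as a hinge loop.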

Figure 4. Surface (CBP80) and ribbon (CBP20) representation illustrating the stabilization of the N-terminal
(magenta) hinge of CBP20 by CBP80. The N-terminus of CBP20 (residues S11, D12 and S13 shown in
magenta) crosses through a "saddle" formed by residues K326 and E327 in CBP80 (cyan), making multiple
contacts that fix it and allowing the hinged motion of residues 16-29 in loop α2-α3. The N-terminal region of
CBP80 is in blue. The inset shows these interactions in detail; they include the hydrogen bonds
K326(NZ)-L9(O) = 3.3 Å, E327(OE2)-S11(OG) = 2.93 Å and E327(OE2)-S13(N) = 2.74 Å.


Table 1. Data collection and refinement statistics

Data collection and phasing

                    Native 1     Native 2     Native       Kr λ1        Kr λ2        Kr λ3
                    + cap        + cap        - cap        (edge)       (peak)       (remote)

Beamline            NX           17-ID        F1 CHESS     F2 CHESS     F2 CHESS     F2 CHESS
                    CHESS        IMCA-CAT
Resolution (Å)      40-2.35      40-2.1       40-2.75      40-2.65      40-2.65      40-2.65
Wavelength (Å)      0.9347       1.00         0.9407       0.8651       0.8659       0.850
Rmerge (%)          6.0          5.8          6.1          7.5          7.6          7.7
I/σI                3.4          3.4          4.5          3.1          3.1          3.1
Reflections         54292        70263        51260        34246        34259        34160
Redundancy          5.6          6.5          6.4          12.8         12.7         12.4
Completeness        99%          99%          100%         95%          95%          95%
No. of sites                                               1            1            1
Phasing power
  Iso                                                      NA           1.24         1.39
  Ano                                                      0.98         1.23         1.04

Refinement

                        + cap        - cap

Rworking                22           24.5
Rfree                   25           29
Monomers / AU           1            1
Non-H protein atoms     7521         7487
Non-H ligand atoms      242          125
Number of waters        375          280
Ramachandran core       90.0%        84%
Ramachandran allowed    10%          16%